Abstract

The alternating direction method of multipliers (ADMM) has been successfully applied to
solve structured convex optimization problems due to its superior practical performance. The
convergence properties of the 2-block ADMM have been studied extensively in the literature.
Specifically, it has been proven that the 2-block ADMM globally converges for any penalty
parameter $\gamma > 0$. In this sense, the 2-block ADMM allows the parameter to be free, i.e., there
is no need to restrict the value of the parameter when implementing this algorithm in order
to ensure convergence. However, for the 3-block ADMM, Chen et al. [4] recently constructed
a counter-example showing that it can diverge if no further condition is imposed. The existing
sufficient conditions that guarantee the convergence of the 3-block
ADMM usually require $\gamma$ to be smaller than a certain bound, which is usually either difficult to
compute or too small to make it a practical algorithm. In this paper, we show that the 3-block
ADMM still globally converges with any penalty parameter $\gamma > 0$ when applied to solve a class
of commonly encountered problems, called regularized least squares decomposition (RLSD)
in this paper, which covers many important applications in practice.
1 Introduction


The alternating direction method of multipliers (ADMM) has been very successfully applied to solve 
many structured convex optimization problems arising from machine learning, image processing, 
statistics, computer vision and so on; see the recent survey paper [2]. The ADMM is particularly 
efficient when the problem has a separable structure in functions and variables. For example, the 
following convex minimization problem with 2-block variables can usually be solved by ADMM, 
provided that a certain structure of the problem is in place: 

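\[
\begin{array}{ll}
\min & f_1(x_1) + f_2(x_2) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 = b, \\
& x_1 \in \mathcal{X}_1,\ x_2 \in \mathcal{X}_2,
\end{array}
\tag{1}
\]

where $f_i(x_i) : \mathbb{R}^{n_i} \to \mathbb{R}$, $i = 1, 2$, are proper closed convex functions, $A_i \in \mathbb{R}^{p \times n_i}$, $i = 1, 2$, $b \in \mathbb{R}^p$
and $\mathcal{X}_i$, $i = 1, 2$, are closed convex sets. A typical iteration of the 2-block ADMM (with given
$(x_2^k, \lambda^k)$) for solving (1) can be described as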


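\[
\begin{array}{rcl}
x_1^{k+1} & := & \operatorname{argmin}_{x_1 \in \mathcal{X}_1} \mathcal{L}_\gamma(x_1, x_2^k; \lambda^k) \\
x_2^{k+1} & := & \operatorname{argmin}_{x_2 \in \mathcal{X}_2} \mathcal{L}_\gamma(x_1^{k+1}, x_2; \lambda^k) \\
\lambda^{k+1} & := & \lambda^k - \gamma (A_1 x_1^{k+1} + A_2 x_2^{k+1} - b),
\end{array}
\tag{2}
\]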



where the augmented Lagrangian function $\mathcal{L}_\gamma$ is defined as

\[
\mathcal{L}_\gamma(x_1, x_2; \lambda) := f_1(x_1) + f_2(x_2) - \langle \lambda, A_1 x_1 + A_2 x_2 - b \rangle + \frac{\gamma}{2}\|A_1 x_1 + A_2 x_2 - b\|_2^2,
\]

where $\lambda$ is the Lagrange multiplier and $\gamma > 0$ is a penalty parameter, which can also be viewed as a
step size on the dual update. The convergence properties of the 2-block ADMM (2) have been studied
extensively in the literature; see for example [?]. A very nice property
of the 2-block ADMM is that it is parameter restriction-free: it has been proven that the 2-block
ADMM (2) is globally convergent for any parameter $\gamma > 0$, starting from anywhere. This property
makes the 2-block ADMM particularly attractive for solving structured convex optimization
problems in the form of (1).

However, this is not the case when ADMM is applied to solve convex problems with 3-block variables:

\[
\begin{array}{ll}
\min & f_1(x_1) + f_2(x_2) + f_3(x_3) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 + A_3 x_3 = b, \\
& x_1 \in \mathcal{X}_1,\ x_2 \in \mathcal{X}_2,\ x_3 \in \mathcal{X}_3.
\end{array}
\tag{3}
\]
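Note that the 3-block ADMM for solving (3) can be described as

\[
\begin{array}{rcl}
x_1^{k+1} & := & \operatorname{argmin}_{x_1 \in \mathcal{X}_1} \mathcal{L}_\gamma(x_1, x_2^k, x_3^k; \lambda^k) \\
x_2^{k+1} & := & \operatorname{argmin}_{x_2 \in \mathcal{X}_2} \mathcal{L}_\gamma(x_1^{k+1}, x_2, x_3^k; \lambda^k) \\
x_3^{k+1} & := & \operatorname{argmin}_{x_3 \in \mathcal{X}_3} \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3; \lambda^k) \\
\lambda^{k+1} & := & \lambda^k - \gamma (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b),
\end{array}
\tag{4}
\]

where the augmented Lagrangian function is defined as

\[
\mathcal{L}_\gamma(x_1, x_2, x_3; \lambda) := f_1(x_1) + f_2(x_2) + f_3(x_3) - \langle \lambda, A_1 x_1 + A_2 x_2 + A_3 x_3 - b \rangle + \frac{\gamma}{2}\|A_1 x_1 + A_2 x_2 + A_3 x_3 - b\|_2^2.
\]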

Regarding its general convergence, however, Chen et al. constructed a counterexample in [4] showing
that the 3-block ADMM (4) can diverge if no further condition is imposed. On the other hand, the
3-block ADMM (4) has been successfully used in many important applications such as the robust
and stable principal component pursuit problem [?], the robust image alignment problem [?],
semidefinite programming [?] and so on. It is therefore of great interest to further study sufficient
conditions to guarantee the convergence of the 3-block ADMM (4). Han and Yuan [13] showed that the
3-block ADMM (4) converges if all the functions $f_1$, $f_2$, $f_3$ are strongly convex and $\gamma$ is restricted
to be smaller than a certain bound. This condition is relaxed in Chen, Shen and You [5] and Lin,
Ma and Zhang [24] to allow only $f_2$ and $f_3$ to be strongly convex, with $\gamma$ still restricted to be smaller
than a certain bound. Moreover, the first sublinear convergence rate result of multi-block ADMM is
established in [25]. Closely related to [?], Cai, Han and Yuan [3] and Li, Sun and Toh [22] proved
the convergence of the 3-block ADMM (4) under the assumption that only one of the functions $f_1$,
$f_2$ and $f_3$ is strongly convex, and $\gamma$ is restricted to be smaller than a certain bound. Davis and
Yin [6] studied a variant of the 3-block ADMM (see Algorithm 8 in [6]) which requires that $f_1$ is
strongly convex and $\gamma$ is smaller than a certain bound to guarantee the convergence. In addition
to strong convexity of $f_2$ and $f_3$, and the boundedness of $\gamma$, by assuming further conditions on the
smoothness of the functions and some rank conditions on the matrices in the linear constraints, Lin,
Ma and Zhang [26] proved the global linear convergence of the 3-block ADMM (4). More recently,
Lin, Ma and Zhang [25] further proposed several alternative approaches to ensure the sublinear
convergence rate of (4) without requiring any function to be strongly convex. Note that in all
these works, to trade for a convergence guarantee, the penalty parameter $\gamma$ is required to be small,
which potentially affects the practical effectiveness of the 3-block ADMM (4), while the 2-block
ADMM (2) does not suffer from such compromises.

Alternatively, one may opt to modify the 3-block ADMM (4) to achieve convergence, with similar
per-iteration computational complexity as (4). The existing methods in the literature along this line
can be classified into the following three main categories. (i) The first class of algorithms requires
a correction step in the updates (see, e.g., [?]). (ii) The second class of algorithms adds
proximal terms and/or a dual step size to the ADMM updates, i.e., these algorithms change (4) to

\[
\begin{array}{rcl}
x_1^{k+1} & := & \operatorname{argmin}_{x_1 \in \mathcal{X}_1} \mathcal{L}_\gamma(x_1, x_2^k, x_3^k; \lambda^k) + \frac{1}{2}\|x_1 - x_1^k\|_{P_1}^2 \\
x_2^{k+1} & := & \operatorname{argmin}_{x_2 \in \mathcal{X}_2} \mathcal{L}_\gamma(x_1^{k+1}, x_2, x_3^k; \lambda^k) + \frac{1}{2}\|x_2 - x_2^k\|_{P_2}^2 \\
x_3^{k+1} & := & \operatorname{argmin}_{x_3 \in \mathcal{X}_3} \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3; \lambda^k) + \frac{1}{2}\|x_3 - x_3^k\|_{P_3}^2 \\
\lambda^{k+1} & := & \lambda^k - \alpha\gamma (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b),
\end{array}
\tag{5}
\]

where the matrices $P_i \succeq 0$ and $\alpha > 0$ denotes a step size for the dual update. Global convergence
and convergence rate for (5) and its variants (for example, allowing to update $x_1$, $x_2$, $x_3$ in a
Jacobian manner instead of a Gauss-Seidel manner) are analyzed under various conditions (see,
e.g., [?, 22]). Note that these works usually require restrictive conditions on $P_i$, $\alpha$
and $\gamma$ that may also affect the performance of solving large-scale problems arising from practice.
Notwithstanding all these efforts, many authors acknowledge that the unmodified 3-block ADMM
(4) usually outperforms its variants (5) and the ones with a correction step in practice (see, e.g., the
discussions in [?]). (iii) The recent work by Sun, Luo and Ye [32] on a randomly permuted
ADMM is probably the only variant of the 3-block ADMM which does not restrict the value of $\gamma$, but its
convergence is currently only guaranteed for solving a square and nonsingular linear system.

Motivated by the fact that the 2-block ADMM (2) allows the parameter to be free, in this paper
we set out to explore the structures of the 3-block model for which the unmodified 3-block ADMM (4)
converges for all parameter values. Given the superior performance of (4), such a property is of great
practical importance. In this paper, we show that the 3-block ADMM (4) is globally convergent for
any fixed $\gamma > 0$ when it is applied to solving a class of convex problems, termed the Regularized
Least Squares Decomposition (RLSD) in this paper, which covers many important applications in
practice as we shall discuss next.


2 Regularized Least Squares Decomposition 

Let us consider the following problem, to be called regularized least squares decomposition (RLSD):

\[
\begin{array}{ll}
\min & f_1(x_1) + f_2(x_2) + \frac{1}{2}\|A_1 x_1 + A_2 x_2 - b\|_2^2 \\
\text{s.t.} & x_1 \in \mathcal{X}_1,\ x_2 \in \mathcal{X}_2,
\end{array}
\tag{6}
\]

where one seeks to decompose the observed data $b$ into two components $A_1 x_1$ and $A_2 x_2$, and $f_1$
and $f_2$ denote some regularization functions that promote certain structures of $x_1$ and $x_2$ in the
decomposed terms. One may also view (6) as a data fitting problem with two regularization terms,
where $\|A_1 x_1 + A_2 x_2 - b\|_2^2$ denotes a least squares loss function on the data fitting term. One way
to solve (6) is to apply the 3-block ADMM (4) to solve its equivalent reformulation:

\[
\begin{array}{ll}
\min & f_1(x_1) + f_2(x_2) + f_3(x_3) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 + x_3 = b, \quad x_i \in \mathcal{X}_i,\ i = 1, 2,
\end{array}
\tag{7}
\]



where $f_3(x_3) = \frac{1}{2}\|x_3\|_2^2$. Many works in the literature (including Boyd et al. [2] and Hong, Luo
and Razaviyayn [21]) have suggested applying ADMM to solve (6) by reformulating it as (7). The
advantage of using ADMM to solve (7) is that the subproblems are usually easy to solve. In particular,
the subproblem for $x_3$ has a closed-form solution. Yang and Zhang [39] applied the 2-block ADMM
to solve the following $\ell_1$-norm regularized least squares problem (or the so-called Lasso problem
[33] in statistics):

\[
\min_x \ \beta\|x\|_1 + \frac{1}{2}\|Ax - b\|_2^2, \tag{8}
\]

where $\beta > 0$ is a weighting parameter. Therefore, the Lasso problem is in fact RLSD with one
block of variables (more on this later). In order to use ADMM, Yang and Zhang [39] reformulated
(8) as

\[
\begin{array}{ll}
\min_{x, r} & \beta\|x\|_1 + \frac{1}{2}\|r\|_2^2 \\
\text{s.t.} & Ax - r = b,
\end{array}
\tag{9}
\]

in which the two-block variables $x$ and $r$ are associated with two structured functions $\|x\|_1$ and
$\|r\|_2^2$, respectively. Numerical experiments conducted in [39] showed that the 2-block ADMM greatly
outperforms other state-of-the-art solvers on this problem. It is noted that the RLSD problem (6)
reduces to the Lasso problem (8) when $f_2$ and $x_2$ vanish and $f_1$ is the $\ell_1$ norm. The RLSD problem
(6) actually covers many interesting applications in practice, and in the following we will discuss
a few examples. RLSD (6) is sometimes also known as the sharing problem in the literature, and we
refer the interested readers to [2] and [21] for more examples of this problem.


Example 2.1 Stable principal component pursuit [?]. This problem aims to recover a low-rank
matrix (the principal components) from a high dimensional data matrix despite both small entry-wise
noise and gross sparse errors. This problem can be formulated as (see Eq. (15) of [?]):

\[
\min_{L, S} \ \beta_1\|L\|_* + \beta_2\|S\|_1 + \frac{1}{2}\|M - L - S\|_F^2, \tag{10}
\]

where $M \in \mathbb{R}^{m \times n}$ is the given corrupted data matrix, and $L$ and $S$ are respectively the low-rank and sparse
components of $M$. It is obvious that this problem is in the form of (6) with $\mathcal{X}_1 = \mathcal{X}_2 = \mathbb{R}^{m \times n}$. For
solving (10) using the 3-block ADMM (4), see [?].
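To make the structure of Example 2.1 concrete, the following is a minimal sketch of the unmodified 3-block ADMM (4) applied to (10) after rewriting it in the form (7) with the constraint $L + S + Z = M$ and $f_3(Z) = \frac{1}{2}\|Z\|_F^2$. The function names `svt`, `soft` and `spcp_admm`, the test data and the parameter values are illustrative assumptions, not taken from the cited works; the closed-form updates follow from the standard proximal maps of the nuclear norm and the $\ell_1$ norm.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal map of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entrywise soft-thresholding: proximal map of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def spcp_admm(M, beta1, beta2, gamma, iters=300):
    """Hypothetical helper: 3-block ADMM for (10) written as L + S + Z = M."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Z = np.zeros_like(M)
    Lam = np.zeros_like(M)
    for _ in range(iters):
        # Block 1: nuclear-norm subproblem, solved by singular value thresholding.
        L = svt(M - S - Z + Lam / gamma, beta1 / gamma)
        # Block 2: l1 subproblem, solved by soft-thresholding.
        S = soft(M - L - Z + Lam / gamma, beta2 / gamma)
        # Block 3: quadratic subproblem with a closed-form solution.
        Z = (Lam - gamma * (L + S - M)) / (1.0 + gamma)
        # Dual update with step size gamma.
        Lam = Lam - gamma * (L + S + Z - M)
    return L, S, Z, Lam

# Synthetic data (an assumption for illustration): rank-2 matrix plus a few spikes.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 20))
M[rng.random(M.shape) < 0.05] += 5.0
L, S, Z, Lam = spcp_admm(M, beta1=1.0, beta2=0.3, gamma=1.0)
residual = np.linalg.norm(L + S + Z - M)
```

In this sketch the multiplier and the third block coincide after every iteration, which is the same structural fact the convergence analysis relies on.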


Example 2.2 Static background extraction from surveillance video [23, 28]. This problem aims to
extract the static background from a surveillance video. Given a sequence of frames of a surveillance
video $M \in \mathbb{R}^{m \times n}$, this problem finds a decomposition of $M$ in the form of $M = u e^\top + S$, where
$u \in \mathbb{R}^m$ denotes the static background of the video, $e$ is the all-ones vector, and $S$ denotes the
sparse moving foreground in the video. Since the components of $u$ represent the pixel values of the
background image, we can restrict $u$ as $b_\ell \leq u \leq b_u$, with $b_\ell = 0$ and $b_u = 255$. This problem can
then be formulated as

\[
\begin{array}{ll}
\min_{u, S} & \beta\|S\|_1 + \frac{1}{2}\|M - u e^\top - S\|_F^2 \\
\text{s.t.} & b_\ell \leq u \leq b_u.
\end{array}
\tag{11}
\]

Note that (11) is a slight modification of Eq. (1.9) in [23] with the bound constraints added to $u$
in order to get a background image with more physical meaning. A similar model was considered
by Ma et al. in [?] for molecular pattern discovery and cancer gene identification. We refer the
interested readers to [23] and [28] for more details of this problem.
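As an illustration of why the constrained block in Example 2.2 remains cheap, the sketch below solves the $u$-subproblem that arises when ADMM is applied to (11): minimizing $\|u e^\top - C\|_F^2$ over the box $b_\ell \leq u \leq b_u$ for a given matrix $C$. The objective separates across the components of $u$, so the solution is the row mean of $C$ clipped to the box. The name `update_background` and the small matrix `C` are assumptions made for this example.

```python
import numpy as np

def update_background(C, lower=0.0, upper=255.0):
    """Hypothetical helper: argmin_u ||u e^T - C||_F^2 subject to lower <= u <= upper.

    Each component u_i solves a one-dimensional convex quadratic, whose
    constrained minimizer is the mean of row i of C clipped to the interval.
    """
    return np.clip(C.mean(axis=1), lower, upper)

def objective(u, C):
    """Value of ||u e^T - C||_F^2 for a candidate background u."""
    return np.sum((u[:, None] - C) ** 2)

# Three pixels, two frames; the rows are chosen so that both bounds become active.
C = np.array([[10.0, 20.0], [300.0, 310.0], [-5.0, -1.0]])
u = update_background(C)
```

Within an ADMM iteration, `C` would be assembled from the data matrix, the current sparse part, the auxiliary block and the scaled multiplier.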

Example 2.3 Compressive Principal Component Pursuit [38]. This problem also considers decomposing
a matrix $M$ into a low-rank part and a sparse part as in (10). The difference is that $M$ is
observed via a small set of linear measurements. This problem can thus be formulated as

\[
\min_{L, S} \ \beta_1\|L\|_* + \beta_2\|S\|_1 + \frac{1}{2}\|M - \mathcal{A}(L) - \mathcal{A}(S)\|_F^2, \tag{12}
\]

where $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n}$ is a linear mapping. Note that (12) is an unconstrained version of
Eq. (1.7) in [38], and (12) is particularly interesting when there is noise in the compressive
measurements $M$. A similar problem has also been considered in [?].

In this paper, we prove that the unmodified 3-block ADMM (4) globally converges with any parameter
$\gamma > 0$ when it is applied to solve the RLSD problem (7). This result provides theoretical
foundations for using the unmodified 3-block ADMM with a free choice of any parameter $\gamma > 0$.

The following assumptions are made throughout this paper. 

Assumption 2.4 The optimal set $\Omega^*$ for problem (7) is non-empty.

According to the first-order optimality conditions for (7), solving (7) is equivalent to finding
$(x_1^*, x_2^*, x_3^*, \lambda^*) \in \Omega^*$ such that the following holds:

\[
\begin{array}{ll}
f_1(x_1) - f_1(x_1^*) - (x_1 - x_1^*)^\top (A_1^\top \lambda^*) \geq 0, & \forall x_1 \in \mathcal{X}_1, \\
f_2(x_2) - f_2(x_2^*) - (x_2 - x_2^*)^\top (A_2^\top \lambda^*) \geq 0, & \forall x_2 \in \mathcal{X}_2, \\
\nabla f_3(x_3^*) - \lambda^* = 0, & \\
A_1 x_1^* + A_2 x_2^* + x_3^* = b. &
\end{array}
\tag{13}
\]

Assumption 2.5 We assume the following conditions hold.

1. $A_1$ and $A_2$ have full column rank.

2. The objective functions $f_1$ and $f_2$ are lower semi-continuous, proper closed convex functions.

3. $f_i + \mathbf{1}_{\mathcal{X}_i}$, $i = 1, 2$, are both coercive functions, where $\mathbf{1}_{\mathcal{X}_i}$ denotes the indicator function of $\mathcal{X}_i$,
i.e.,

\[
\mathbf{1}_{\mathcal{X}_i}(x_i) = \begin{cases} 0, & \text{if } x_i \in \mathcal{X}_i, \\ +\infty, & \text{otherwise.} \end{cases}
\]

Note that this assumption implies that $f_1$ and $f_2$ have finite lower bounds on $\mathcal{X}_1$ and $\mathcal{X}_2$,
respectively, i.e.,

\[
\inf_{x_1 \in \mathcal{X}_1} f_1(x_1) \geq f_1^* > -\infty, \qquad \inf_{x_2 \in \mathcal{X}_2} f_2(x_2) \geq f_2^* > -\infty.
\]

Remark 2.6 We remark here that requiring $f_i + \mathbf{1}_{\mathcal{X}_i}$ to be a coercive function is not a restrictive
assumption. Many functions used as regularization terms, including the $\ell_1$-norm, $\ell_2$-norm and $\ell_\infty$-norm
for vectors and the nuclear norm for matrices, are all coercive functions; assuming the compactness of
$\mathcal{X}_i$ also leads to the coerciveness of $f_i + \mathbf{1}_{\mathcal{X}_i}$. For instance, the problems considered in Examples 2.1-2.3
all satisfy this assumption.


Our main result in this paper is summarized in the following theorem, whose proof will be given in
Section 3.

Theorem 2.7 Assume that Assumptions 2.4 and 2.5 hold. For any given $\gamma > 0$, let $(x_1^k, x_2^k, x_3^k; \lambda^k)$
be the sequence generated by the 3-block ADMM (4) for solving (7). Then any limit point of
$(x_1^k, x_2^k, x_3^k; \lambda^k)$ is an optimal solution to problem (7). Moreover, the objective function value
converges to the optimal value and the constraint violation converges to zero, i.e.,

\[
\lim_{k \to \infty} \left| f_1(x_1^k) + f_2(x_2^k) + f_3(x_3^k) - f^* \right| = 0, \quad \text{and} \quad \lim_{k \to \infty} \left\| A_1 x_1^k + A_2 x_2^k + x_3^k - b \right\| = 0, \tag{14}
\]

where $f^*$ denotes the optimal objective value of problem (7).
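The statement of Theorem 2.7 can be probed numerically on a small instance. The sketch below is only an illustration under stated assumptions: it takes $f_i(x_i) = \frac{\mu}{2}\|x_i\|_2^2$ with $\mathcal{X}_i = \mathbb{R}^{n_i}$ (coercive, so Assumption 2.5 holds when $A_1$, $A_2$ have full column rank), for which every subproblem of the 3-block ADMM (4) applied to (7) is a linear solve and the exact minimizer of (6) is available from the normal equations. The function name `admm_rlsd_ridge`, the random data and the chosen values of `gamma` are hypothetical.

```python
import numpy as np

def admm_rlsd_ridge(A1, A2, b, mu, gamma, iters=2000):
    """Hypothetical helper: 3-block ADMM on (7) with f_i = (mu/2)||x_i||^2."""
    p, n1 = A1.shape
    n2 = A2.shape[1]
    x1, x2 = np.zeros(n1), np.zeros(n2)
    x3, lam = np.zeros(p), np.zeros(p)
    H1 = mu * np.eye(n1) + gamma * A1.T @ A1
    H2 = mu * np.eye(n2) + gamma * A2.T @ A2
    for _ in range(iters):
        # Gauss-Seidel sweep over the three blocks, then the dual update.
        x1 = np.linalg.solve(H1, A1.T @ (lam - gamma * (A2 @ x2 + x3 - b)))
        x2 = np.linalg.solve(H2, A2.T @ (lam - gamma * (A1 @ x1 + x3 - b)))
        x3 = (lam - gamma * (A1 @ x1 + A2 @ x2 - b)) / (1.0 + gamma)
        lam = lam - gamma * (A1 @ x1 + A2 @ x2 + x3 - b)
    return x1, x2, x3, lam

rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((12, 4)), rng.standard_normal((12, 4))
b, mu = rng.standard_normal(12), 1.0

# Reference minimizer of (6) for this quadratic instance (normal equations).
A = np.hstack([A1, A2])
x_ref = np.linalg.solve(mu * np.eye(8) + A.T @ A, A.T @ b)

errors, violations = {}, {}
for gamma in (0.3, 1.0, 4.0):  # one value from each range treated in the analysis
    x1, x2, x3, lam = admm_rlsd_ridge(A1, A2, b, mu, gamma)
    errors[gamma] = np.linalg.norm(np.concatenate([x1, x2]) - x_ref)
    violations[gamma] = np.linalg.norm(A1 @ x1 + A2 @ x2 + x3 - b)
```

The three values of `gamma` are picked so that each falls into a different one of the parameter ranges handled separately in the convergence proof.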


In our analysis, the following well-known identity and inequality are used frequently:

\[
(w_1 - w_2)^\top (w_3 - w_1) = \frac{1}{2}\left( \|w_2 - w_3\|^2 - \|w_1 - w_2\|^2 - \|w_1 - w_3\|^2 \right), \tag{15}
\]

\[
w_1^\top w_2 \geq -\frac{1}{2\xi}\|w_1\|^2 - \frac{\xi}{2}\|w_2\|^2, \quad \forall \xi > 0. \tag{16}
\]
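Both relations are elementary, and a quick numerical check can help when tracking signs in the later derivations. The snippet below is a sanity check on random vectors only, not a proof; the variable names and the sampled values of the free parameter in (16) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
w1, w2, w3 = rng.standard_normal((3, 5))

def sq(v):
    """Squared Euclidean norm."""
    return float(v @ v)

# Identity (15): both sides computed independently.
lhs15 = float((w1 - w2) @ (w3 - w1))
rhs15 = 0.5 * (sq(w2 - w3) - sq(w1 - w2) - sq(w1 - w3))

# Inequality (16): the gap equals 0.5 * ||w1 / sqrt(xi) + sqrt(xi) * w2||^2.
gaps16 = {xi: float(w1 @ w2) + sq(w1) / (2 * xi) + xi * sq(w2) / 2
          for xi in (0.25, 0.5, 1.0, 3.0)}
```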
Notations. We denote by $f(u) \equiv \sum_{i=1}^{3} f_i(x_i)$ the sum of the separable functions. We will use the
following notations to simplify the presentation:

\[
u := \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \qquad
w := \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \lambda \end{pmatrix}, \qquad
F(w) := \begin{pmatrix} -A_1^\top \lambda \\ -A_2^\top \lambda \\ -\lambda \\ A_1 x_1 + A_2 x_2 + x_3 - b \end{pmatrix}.
\tag{17}
\]

When there is no ambiguity, we often use $\|\cdot\|$ to denote the Euclidean norm $\|\cdot\|_2$.
3 Convergence Analysis


In this section, we shall prove Theorem 2.7. We will divide the proof into three parts: Theorems
3.1, 3.2 and 3.3 show that the conclusion of Theorem 2.7 holds true if $\gamma \in (1, +\infty)$, $\gamma \in (\sqrt{2} - 1, 1]$
and $\gamma \in (0, \frac{1}{2}]$, respectively. As a result, combining Theorems 3.1, 3.2 and 3.3, the conclusion of
Theorem 2.7 follows for any $\gamma > 0$.


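Since $f_3(x_3) = \frac{1}{2}\|x_3\|_2^2$ in (7), the 3-block ADMM (4) for solving (7) reduces to

\[
x_1^{k+1} := \operatorname{argmin}_{x_1 \in \mathcal{X}_1} \ f_1(x_1) + \frac{\gamma}{2}\left\| A_1 x_1 + A_2 x_2^k + x_3^k - b - \frac{1}{\gamma}\lambda^k \right\|^2 \tag{18}
\]
\[
x_2^{k+1} := \operatorname{argmin}_{x_2 \in \mathcal{X}_2} \ f_2(x_2) + \frac{\gamma}{2}\left\| A_1 x_1^{k+1} + A_2 x_2 + x_3^k - b - \frac{1}{\gamma}\lambda^k \right\|^2 \tag{19}
\]
\[
x_3^{k+1} := \frac{1}{\gamma + 1}\left[ \lambda^k - \gamma\left( A_1 x_1^{k+1} + A_2 x_2^{k+1} - b \right) \right] \tag{20}
\]
\[
\lambda^{k+1} := \lambda^k - \gamma\left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b \right). \tag{21}
\]

An immediate observation from (20) and (21) is $x_3^k = \lambda^k$ for any $k > 0$.

The first-order optimality conditions for (18)-(19) are given by $x_i^{k+1} \in \mathcal{X}_i$, $i = 1, 2$, and

\[
(x_1 - x_1^{k+1})^\top \left[ g_1(x_1^{k+1}) - A_1^\top \lambda^k + \gamma A_1^\top \left( A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b \right) \right] \geq 0, \quad \forall x_1 \in \mathcal{X}_1, \tag{22}
\]
\[
(x_2 - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^k + \gamma A_2^\top \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^k - b \right) \right] \geq 0, \quad \forall x_2 \in \mathcal{X}_2, \tag{23}
\]

where $g_i \in \partial f_i$ is a subgradient of $f_i$ for $i = 1, 2$. Moreover, by combining with (21), (22)-(23)
can be rewritten as

\[
(x_1 - x_1^{k+1})^\top \left[ g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} + \gamma A_1^\top \left( A_2 (x_2^k - x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \right] \geq 0, \tag{24}
\]
\[
(x_2 - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} + \gamma A_2^\top \left( x_3^k - x_3^{k+1} \right) \right] \geq 0. \tag{25}
\]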
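The observation that the third block and the multiplier coincide after each iteration can be checked directly from the closed-form updates (20) and (21). The sketch below does this for random inputs; the helper name `x3_and_dual_update` and its argument `s`, which stands for $A_1 x_1^{k+1} + A_2 x_2^{k+1} - b$, are illustrative assumptions.

```python
import numpy as np

def x3_and_dual_update(s, lam, gamma):
    """Hypothetical helper applying (20) and then (21).

    s plays the role of A1 x1^{k+1} + A2 x2^{k+1} - b and lam of lambda^k.
    """
    x3 = (lam - gamma * s) / (gamma + 1.0)   # update (20)
    lam_new = lam - gamma * (s + x3)         # update (21)
    return x3, lam_new

rng = np.random.default_rng(2)
pairs = []
for gamma in (0.1, 0.5, 1.0, 7.0):
    s, lam = rng.standard_normal(6), rng.standard_normal(6)
    pairs.append(x3_and_dual_update(s, lam, gamma))
```

Algebraically, (20) gives $(\gamma + 1) x_3^{k+1} = \lambda^k - \gamma s$, so (21) yields $\lambda^{k+1} = (\gamma + 1) x_3^{k+1} - \gamma x_3^{k+1} = x_3^{k+1}$ regardless of the starting multiplier.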


3.1 Proof for $\gamma \in (1, +\infty)$

In this subsection, we prove that the 3-block ADMM (18)-(21) is convergent for any $\gamma \in (1, +\infty)$.

Theorem 3.1 Let $(x_1^k, x_2^k, x_3^k, \lambda^k)$ be generated by the 3-block ADMM (18)-(21), and $\gamma \in (1, +\infty)$.
Then $(x_1^k, x_2^k, x_3^k, \lambda^k)$ is bounded, and any of its cluster points $(x_1^*, x_2^*, x_3^*, \lambda^*)$ is an optimal solution
of (7). Moreover, (14) holds.


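Proof. Note that the augmented Lagrangian function is

\[
\mathcal{L}_\gamma(x_1, x_2, x_3; \lambda) = f_1(x_1) + f_2(x_2) + \frac{1}{2}\|x_3\|_2^2 - \langle \lambda, A_1 x_1 + A_2 x_2 + x_3 - b \rangle + \frac{\gamma}{2}\|A_1 x_1 + A_2 x_2 + x_3 - b\|_2^2.
\]

The following inequalities hold:

\[
\begin{aligned}
& \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^k, x_3^k; \lambda^k) \\
={} & f_1(x_1^k) - f_1(x_1^{k+1}) - \langle \lambda^k, A_1 x_1^k - A_1 x_1^{k+1} \rangle \\
& + \frac{\gamma}{2}\|A_1 x_1^k + A_2 x_2^k + x_3^k - b\|_2^2 - \frac{\gamma}{2}\|A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b\|_2^2 \\
\geq{} & g_1(x_1^{k+1})^\top (x_1^k - x_1^{k+1}) - \langle \lambda^k, A_1 x_1^k - A_1 x_1^{k+1} \rangle \\
& + \gamma (A_1 x_1^k - A_1 x_1^{k+1})^\top (A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b) + \frac{\gamma}{2}\|A_1 x_1^k - A_1 x_1^{k+1}\|_2^2 \\
\geq{} & \frac{\gamma}{2}\|A_1 x_1^k - A_1 x_1^{k+1}\|_2^2,
\end{aligned}
\tag{26}
\]

where the first inequality is due to the convexity of $f_1$ and the identity (15), and the second
inequality is obtained by setting $x_1 = x_1^k$ in (22). Similarly,

\[
\begin{aligned}
& \mathcal{L}_\gamma(x_1^{k+1}, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^k; \lambda^k) \\
={} & f_2(x_2^k) - f_2(x_2^{k+1}) - \langle \lambda^k, A_2 x_2^k - A_2 x_2^{k+1} \rangle \\
& + \frac{\gamma}{2}\|A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b\|_2^2 - \frac{\gamma}{2}\|A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^k - b\|_2^2 \\
\geq{} & g_2(x_2^{k+1})^\top (x_2^k - x_2^{k+1}) - \langle \lambda^k, A_2 x_2^k - A_2 x_2^{k+1} \rangle \\
& + \gamma (A_2 x_2^k - A_2 x_2^{k+1})^\top (A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^k - b) + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^{k+1}\|_2^2 \\
\geq{} & \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^{k+1}\|_2^2,
\end{aligned}
\tag{27}
\]

where the first inequality is due to the convexity of $f_2$ and the identity (15), and the second
inequality is obtained by setting $x_2 = x_2^k$ in (23). By (20), it is easy to show that

\[
\mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^k) \geq \frac{\gamma + 1}{2}\|x_3^k - x_3^{k+1}\|^2. \tag{28}
\]

Combining (26), (27) and (28) yields

\[
\begin{aligned}
& \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^k) \\
\geq{} & \frac{\gamma}{2}\|A_1 x_1^k - A_1 x_1^{k+1}\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^{k+1}\|^2 + \frac{\gamma + 1}{2}\|x_3^k - x_3^{k+1}\|^2.
\end{aligned}
\tag{29}
\]

By (20) and (21), it is not difficult to get $\lambda^{k+1} = x_3^{k+1}$, and

\[
\mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^{k+1}) = -\frac{1}{\gamma}\|\lambda^k - \lambda^{k+1}\|^2 = -\frac{1}{\gamma}\|x_3^k - x_3^{k+1}\|^2. \tag{30}
\]

Combining (29) and (30) yields

\[
\begin{aligned}
& \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^{k+1}) \\
\geq{} & \frac{\gamma}{2}\|A_1 x_1^k - A_1 x_1^{k+1}\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^{k+1}\|^2 + \left( \frac{\gamma + 1}{2} - \frac{1}{\gamma} \right)\|x_3^k - x_3^{k+1}\|^2 \\
\geq{} & M \left( \|A_1 x_1^k - A_1 x_1^{k+1}\|^2 + \|A_2 x_2^k - A_2 x_2^{k+1}\|^2 + \|x_3^k - x_3^{k+1}\|^2 \right),
\end{aligned}
\tag{31}
\]

where

\[
M := \min\left\{ \frac{\gamma}{2},\ \frac{\gamma + 1}{2} - \frac{1}{\gamma} \right\} > 0,
\]

because of the fact that $\gamma > 1$. Therefore we know that $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ is monotonically decreasing.
Now we show that the augmented Lagrangian function has a uniform lower bound $L^* := f_1^* + f_2^*$.
In fact, we have the following inequality: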


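\[
\begin{aligned}
& \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^{k+1}) \\
={} & f_1(x_1^{k+1}) + f_2(x_2^{k+1}) + \frac{1}{2}\|x_3^{k+1}\|^2 - \left\langle \lambda^{k+1}, \sum_{i=1}^{2} A_i x_i^{k+1} + x_3^{k+1} - b \right\rangle + \frac{\gamma}{2}\left\| \sum_{i=1}^{2} A_i x_i^{k+1} + x_3^{k+1} - b \right\|^2 \\
={} & f_1(x_1^{k+1}) + f_2(x_2^{k+1}) + \frac{1}{2}\left\| \sum_{i=1}^{2} A_i x_i^{k+1} - b \right\|^2 + \frac{\gamma - 1}{2}\left\| \sum_{i=1}^{2} A_i x_i^{k+1} + x_3^{k+1} - b \right\|^2 \\
\geq{} & f_1^* + f_2^* = L^*,
\end{aligned}
\tag{32}
\]

where in the second equality we used the fact that $x_3^{k+1} = \lambda^{k+1}$. Note that (31) and (32) imply
that $\{(x_1^k, x_2^k) : k = 0, 1, \ldots\}$ is bounded by using the facts that $x_1^k \in \mathcal{X}_1$, $x_2^k \in \mathcal{X}_2$ and $f_1 + \mathbf{1}_{\mathcal{X}_1}$ and
$f_2 + \mathbf{1}_{\mathcal{X}_2}$ are coercive. Note that (31) and (32) also imply that $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ is convergent.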
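The two facts established so far, monotone decrease of the augmented Lagrangian and its lower bound $L^*$, can be observed on a small quadratic instance. The sketch below is an illustration under assumptions: it uses $f_i(x_i) = \frac{\mu}{2}\|x_i\|_2^2$ on $\mathbb{R}^{n_i}$ (so $f_1^* = f_2^* = 0$ and $L^* = 0$), random data, and $\gamma = 2$; the function name `lagrangian_history` is hypothetical. The iterates start from zero so that the multiplier equals the third block from the first step on.

```python
import numpy as np

def lagrangian_history(A1, A2, b, mu, gamma, iters=60):
    """Hypothetical helper: augmented Lagrangian values along iterations (18)-(21)
    for f_i = (mu/2)||x_i||^2 and f_3 = (1/2)||x_3||^2."""
    p, n1 = A1.shape
    n2 = A2.shape[1]
    x1, x2 = np.zeros(n1), np.zeros(n2)
    x3, lam = np.zeros(p), np.zeros(p)
    H1 = mu * np.eye(n1) + gamma * A1.T @ A1
    H2 = mu * np.eye(n2) + gamma * A2.T @ A2

    def value(x1, x2, x3, lam):
        r = A1 @ x1 + A2 @ x2 + x3 - b
        return (0.5 * mu * (x1 @ x1 + x2 @ x2) + 0.5 * (x3 @ x3)
                - lam @ r + 0.5 * gamma * (r @ r))

    hist = [value(x1, x2, x3, lam)]
    for _ in range(iters):
        x1 = np.linalg.solve(H1, A1.T @ (lam - gamma * (A2 @ x2 + x3 - b)))
        x2 = np.linalg.solve(H2, A2.T @ (lam - gamma * (A1 @ x1 + x3 - b)))
        x3 = (lam - gamma * (A1 @ x1 + A2 @ x2 - b)) / (1.0 + gamma)
        lam = lam - gamma * (A1 @ x1 + A2 @ x2 + x3 - b)
        hist.append(value(x1, x2, x3, lam))
    return np.array(hist)

rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((12, 4)), rng.standard_normal((12, 4))
b = rng.standard_normal(12)
hist = lagrangian_history(A1, A2, b, mu=1.0, gamma=2.0)
```

For $\gamma \leq 1$ the decrease estimate (31) no longer applies, which is why the remaining parameter ranges are handled with a different argument.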

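By combining (31) and (32) we know that the following holds for any integer $K > 0$:

\[
\begin{aligned}
& \sum_{k=0}^{K} \left( \|A_1 x_1^k - A_1 x_1^{k+1}\|^2 + \|A_2 x_2^k - A_2 x_2^{k+1}\|^2 + \|x_3^k - x_3^{k+1}\|^2 \right) \\
\leq{} & \frac{1}{M} \sum_{k=0}^{K} \left( \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, x_3^{k+1}; \lambda^{k+1}) \right) \\
={} & \frac{1}{M} \left( \mathcal{L}_\gamma(x_1^0, x_2^0, x_3^0; \lambda^0) - \mathcal{L}_\gamma(x_1^{K+1}, x_2^{K+1}, x_3^{K+1}; \lambda^{K+1}) \right) \\
\leq{} & \frac{1}{M} \left( \mathcal{L}_\gamma(x_1^0, x_2^0, x_3^0; \lambda^0) - L^* \right).
\end{aligned}
\]

By letting $K \to +\infty$ we obtain

\[
\sum_{k=0}^{\infty} \left( \|A_1 x_1^k - A_1 x_1^{k+1}\|^2 + \|A_2 x_2^k - A_2 x_2^{k+1}\|^2 + \|x_3^k - x_3^{k+1}\|^2 \right) \leq \frac{1}{M} \left( \mathcal{L}_\gamma(x_1^0, x_2^0, x_3^0; \lambda^0) - L^* \right) < \infty,
\]

and hence

\[
\lim_{k \to \infty} \left( \|A_1 x_1^k - A_1 x_1^{k+1}\| + \|A_2 x_2^k - A_2 x_2^{k+1}\| + \|x_3^k - x_3^{k+1}\| \right) = 0. \tag{33}
\]

By using (32), $\lambda^k = x_3^k$, and the boundedness of $\{(x_1^k, x_2^k) : k = 0, 1, \ldots\}$, we can conclude
that $\{(x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, \ldots\}$ is a bounded sequence. Therefore, there exists a limit point
$(x_1^*, x_2^*, x_3^*, \lambda^*)$ and a subsequence $\{k_q\}$ such that

\[
\lim_{q \to \infty} x_i^{k_q} = x_i^*, \ i = 1, 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q} = \lambda^*.
\]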
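By using (33), we have

\[
\lim_{q \to \infty} x_i^{k_q + 1} = x_i^*, \ i = 1, 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q + 1} = \lambda^*.
\]

Since $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ is convergent, we know that

\[
\lim_{k \to \infty} \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) = \mathcal{L}_\gamma(x_1^*, x_2^*, x_3^*; \lambda^*). \tag{34}
\]

By combining (20), (21), (24) and (25), we know the following relations for any $x_1 \in \mathcal{X}_1$ and
$x_2 \in \mathcal{X}_2$:

\[
\begin{aligned}
& f_1(x_1) - f_1(x_1^{k_q + 1}) + (x_1 - x_1^{k_q + 1})^\top \left[ -A_1^\top \lambda^{k_q + 1} + \gamma A_1^\top \left( A_2 (x_2^{k_q} - x_2^{k_q + 1}) + (x_3^{k_q} - x_3^{k_q + 1}) \right) \right] \geq 0, \\
& f_2(x_2) - f_2(x_2^{k_q + 1}) + (x_2 - x_2^{k_q + 1})^\top \left[ -A_2^\top \lambda^{k_q + 1} + \gamma A_2^\top \left( x_3^{k_q} - x_3^{k_q + 1} \right) \right] \geq 0, \\
& x_3^{k_q + 1} - \lambda^{k_q + 1} = 0, \\
& A_1 x_1^{k_q + 1} + A_2 x_2^{k_q + 1} + x_3^{k_q + 1} - b - \frac{1}{\gamma}\left( \lambda^{k_q} - \lambda^{k_q + 1} \right) = 0.
\end{aligned}
\]

Letting $q \to +\infty$, and using (33) and the lower semi-continuity of $f_1$ and $f_2$, we have the following
relations for any $x_1 \in \mathcal{X}_1$ and $x_2 \in \mathcal{X}_2$:

\[
\begin{aligned}
& f_1(x_1) - f_1(x_1^*) - (x_1 - x_1^*)^\top (A_1^\top \lambda^*) \geq 0, \\
& f_2(x_2) - f_2(x_2^*) - (x_2 - x_2^*)^\top (A_2^\top \lambda^*) \geq 0, \\
& x_3^* - \lambda^* = 0, \\
& A_1 x_1^* + A_2 x_2^* + x_3^* - b = 0.
\end{aligned}
\]

Therefore, $(x_1^*, x_2^*, x_3^*, \lambda^*)$ satisfies the optimality conditions of problem (7) and is an optimal
solution of problem (7).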

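Moreover, we have

\[
\|A_1 x_1^k + A_2 x_2^k + x_3^k - b\| = \frac{1}{\gamma}\|\lambda^{k-1} - \lambda^k\| \to 0, \quad \text{when } k \to +\infty,
\]

and

\[
\begin{aligned}
\left| f_1(x_1^k) + f_2(x_2^k) + \frac{1}{2}\|x_3^k\|^2 - f^* \right|
\leq{} & \left| \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^*, x_2^*, x_3^*; \lambda^*) \right| + \|\lambda^k\| \cdot \|A_1 x_1^k + A_2 x_2^k + x_3^k - b\| \\
& + \frac{\gamma}{2}\|A_1 x_1^k + A_2 x_2^k + x_3^k - b\|^2 \to 0, \quad \text{when } k \to \infty,
\end{aligned}
\]

where we used (34). Therefore, (14) is proven. □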
3.2 Proof for $\gamma \in (\sqrt{2} - 1, 1]$

In this subsection, we prove that the 3-block ADMM (18)-(21) is convergent for any $\gamma \in (\sqrt{2} - 1, 1]$.

Theorem 3.2 Let $(x_1^k, x_2^k, x_3^k, \lambda^k)$ be generated by the 3-block ADMM (18)-(21), and $\gamma \in (\sqrt{2} - 1, 1]$.
Then $(x_1^k, x_2^k, x_3^k, \lambda^k)$ is bounded, and it converges to an optimal solution of (7), which further
implies that (14) holds.

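Proof. Let $(x_1^*, x_2^*, x_3^*, \lambda^*) \in \Omega^*$. By setting $x_1 = x_1^*$ in (24) and $x_2 = x_2^*$ in (25), we get

\[
(x_1^* - x_1^{k+1})^\top \left[ g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} + \gamma A_1^\top \left( A_2 (x_2^k - x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \right] \geq 0, \tag{35}
\]
\[
(x_2^* - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} + \gamma A_2^\top \left( x_3^k - x_3^{k+1} \right) \right] \geq 0. \tag{36}
\]

From the optimality conditions (13), and (21), we can get

\[
\begin{aligned}
& \frac{1}{\gamma}(\lambda^k - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^*) - (\lambda^k - \lambda^{k+1})^\top \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \\
& + \gamma (A_2 x_2^{k+1} - A_2 x_2^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}) + \gamma (x_3^{k+1} - x_3^*)^\top \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \\
={} & \frac{1}{\gamma}(\lambda^k - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^*) - \gamma (A_2 x_2^{k+1} - A_2 x_2^*)^\top (x_3^k - x_3^{k+1}) \\
& - \gamma (A_1 x_1^{k+1} - A_1 x_1^*)^\top \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \\
={} & (A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b)^\top (\lambda^{k+1} - \lambda^*) - \gamma (A_2 x_2^{k+1} - A_2 x_2^*)^\top (x_3^k - x_3^{k+1}) \\
& - \gamma (A_1 x_1^{k+1} - A_1 x_1^*)^\top \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \\
={} & \left[ A_1 (x_1^{k+1} - x_1^*) + A_2 (x_2^{k+1} - x_2^*) + (x_3^{k+1} - x_3^*) \right]^\top (\lambda^{k+1} - \lambda^*) - \gamma (A_2 x_2^{k+1} - A_2 x_2^*)^\top (x_3^k - x_3^{k+1}) \\
& - \gamma (A_1 x_1^{k+1} - A_1 x_1^*)^\top \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \\
={} & (A_1 x_1^{k+1} - A_1 x_1^*)^\top \left[ (\lambda^{k+1} - \lambda^*) - \gamma \left( (A_2 x_2^k - A_2 x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \right] + (x_3^{k+1} - x_3^*)^\top (\lambda^{k+1} - \lambda^*) \\
& + (A_2 x_2^{k+1} - A_2 x_2^*)^\top \left[ (\lambda^{k+1} - \lambda^*) - \gamma (x_3^k - x_3^{k+1}) \right] \\
\geq{} & (x_1^{k+1} - x_1^*)^\top \left( g_1(x_1^{k+1}) - g_1(x_1^*) \right) + (x_2^{k+1} - x_2^*)^\top \left( g_2(x_2^{k+1}) - g_2(x_2^*) \right) + \|x_3^{k+1} - x_3^*\|^2 \\
\geq{} & \|x_3^{k+1} - x_3^*\|^2,
\end{aligned}
\tag{37}
\]

where the first inequality holds by adding (35) and (36), and the second inequality holds because
of the monotonicity of $g_1$ and $g_2$. By using the fact that $x_3^k = \lambda^k$, (37) can be reduced to

\[
\begin{aligned}
& \frac{1}{\gamma}(\lambda^k - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^*) + \gamma (A_2 x_2^{k+1} - A_2 x_2^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}) + \gamma (x_3^{k+1} - x_3^*)^\top (x_3^k - x_3^{k+1}) \\
\geq{} & \|x_3^{k+1} - x_3^*\|^2 + \|x_3^k - x_3^{k+1}\|^2 + (\lambda^k - \lambda^{k+1})^\top (A_2 x_2^k - A_2 x_2^{k+1}) \\
& - \gamma (x_3^{k+1} - x_3^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}).
\end{aligned}
\tag{38}
\]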
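Now by applying (15) to the three terms on the left hand side of (38) we get,

\[
\begin{aligned}
& \left[ \frac{1}{2\gamma}\|\lambda^k - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^k - x_3^*\|^2 \right] \\
& - \left[ \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^{k+1} - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^{k+1} - x_3^*\|^2 \right] \\
\geq{} & \|x_3^{k+1} - x_3^*\|^2 + \|x_3^{k+1} - x_3^k\|^2 + \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^k\|^2 + \frac{\gamma}{2}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2 + \frac{\gamma}{2}\|x_3^{k+1} - x_3^k\|^2 \\
& + (\lambda^k - \lambda^{k+1})^\top (A_2 x_2^k - A_2 x_2^{k+1}) - \gamma (x_3^{k+1} - x_3^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}).
\end{aligned}
\tag{39}
\]

By applying (16), we have

\[
-\gamma (x_3^{k+1} - x_3^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}) \geq -\gamma \left( \|x_3^{k+1} - x_3^*\|^2 + \frac{1}{4}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2 \right). \tag{40}
\]

From (39), (40), the fact that $\lambda^k = x_3^k$, and the following identity

\[
\frac{1}{\gamma}\|\lambda^{k+1} - \lambda^k\|^2 + (\lambda^{k+1} - \lambda^k)^\top (A_2 x_2^{k+1} - A_2 x_2^k) + \frac{\gamma}{4}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2
= \left\| \frac{1}{\sqrt{\gamma}}(\lambda^{k+1} - \lambda^k) + \frac{\sqrt{\gamma}}{2}(A_2 x_2^{k+1} - A_2 x_2^k) \right\|^2,
\]

we have

\[
\begin{aligned}
& \left[ \frac{1}{2\gamma}\|\lambda^k - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^k - x_3^*\|^2 \right] \\
& - \left[ \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^{k+1} - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^{k+1} - x_3^*\|^2 \right] \\
\geq{} & (1 - \gamma)\|x_3^{k+1} - x_3^*\|^2 + \left( 1 + \frac{\gamma}{2} - \frac{1}{2\gamma} \right)\|x_3^{k+1} - x_3^k\|^2
+ \left\| \frac{1}{\sqrt{\gamma}}(\lambda^{k+1} - \lambda^k) + \frac{\sqrt{\gamma}}{2}(A_2 x_2^{k+1} - A_2 x_2^k) \right\|^2 \\
\geq{} & 0,
\end{aligned}
\tag{41}
\]

where the last inequality holds since $1 + \frac{\gamma}{2} - \frac{1}{2\gamma} > 0$ due to the fact that $\gamma \in (\sqrt{2} - 1, 1]$. In other
words, $\frac{1}{2\gamma}\|\lambda^k - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^k - x_3^*\|^2$ is non-increasing and lower bounded, and
thus it is convergent. This further implies that $\|x_3^{k+1} - x_3^k\| \to 0$ from (41). Hence, $\|\lambda^{k+1} - \lambda^k\| \to 0$.
Finally, again from (41) we have $\|A_2 x_2^{k+1} - A_2 x_2^k\| \to 0$.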
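The monotonicity of the weighted distance to an optimal point in (41) can also be observed numerically. The sketch below is an illustration under assumptions: it reuses a quadratic instance with $f_i(x_i) = \frac{\mu}{2}\|x_i\|_2^2$ on $\mathbb{R}^{n_i}$, for which an optimal point of (7) is available in closed form ($\mu x_i^* = A_i^\top \lambda^*$, $\lambda^* = x_3^* = b - A_1 x_1^* - A_2 x_2^*$), and takes $\gamma = 0.8$. The function name `lyapunov_history` and the data are hypothetical.

```python
import numpy as np

def lyapunov_history(A1, A2, b, mu, gamma, iters=200):
    """Hypothetical helper: values of
    (1/(2 gamma))||lam - lam*||^2 + (gamma/2)||A2 x2 - A2 x2*||^2 + (gamma/2)||x3 - x3*||^2
    along iterations (18)-(21) with f_i = (mu/2)||x_i||^2."""
    p, n1 = A1.shape
    n2 = A2.shape[1]
    # Closed-form optimal point for this quadratic instance.
    A = np.hstack([A1, A2])
    x_star = np.linalg.solve(mu * np.eye(n1 + n2) + A.T @ A, A.T @ b)
    x2_star = x_star[n1:]
    x3_star = b - A @ x_star      # equals lam_star by the optimality conditions
    lam_star = x3_star

    def V(x2, x3, lam):
        return (np.sum((lam - lam_star) ** 2) / (2 * gamma)
                + 0.5 * gamma * np.sum((A2 @ (x2 - x2_star)) ** 2)
                + 0.5 * gamma * np.sum((x3 - x3_star) ** 2))

    x1, x2 = np.zeros(n1), np.zeros(n2)
    x3, lam = np.zeros(p), np.zeros(p)   # lam^0 = x3^0, as the analysis assumes
    H1 = mu * np.eye(n1) + gamma * A1.T @ A1
    H2 = mu * np.eye(n2) + gamma * A2.T @ A2
    hist = [V(x2, x3, lam)]
    for _ in range(iters):
        x1 = np.linalg.solve(H1, A1.T @ (lam - gamma * (A2 @ x2 + x3 - b)))
        x2 = np.linalg.solve(H2, A2.T @ (lam - gamma * (A1 @ x1 + x3 - b)))
        x3 = (lam - gamma * (A1 @ x1 + A2 @ x2 - b)) / (1.0 + gamma)
        lam = lam - gamma * (A1 @ x1 + A2 @ x2 + x3 - b)
        hist.append(V(x2, x3, lam))
    return np.array(hist)

rng = np.random.default_rng(0)
A1, A2 = rng.standard_normal((12, 4)), rng.standard_normal((12, 4))
b = rng.standard_normal(12)
V_hist = lyapunov_history(A1, A2, b, mu=1.0, gamma=0.8)
```

Note that the quantity tracked here does not involve the first block; its convergence is recovered afterwards from the dual update.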

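Since (41) also shows that $\frac{1}{2\gamma}\|\lambda^k - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^k - x_3^*\|^2$ is upper bounded, we can
conclude that $\{(x_2^k, x_3^k, \lambda^k) : k = 0, 1, \ldots\}$ is bounded because $A_2$ has full column rank. It follows
from (21) and the fact that $A_1$ has full column rank that $\{x_1^k : k = 0, 1, \ldots\}$ is bounded. Therefore,
there exists a limit point $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$ and a subsequence $\{k_q\}$ such that

\[
\lim_{q \to \infty} x_i^{k_q} = \bar{x}_i, \ i = 1, 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q} = \bar{\lambda}.
\]

By $\|A_2 x_2^{k+1} - A_2 x_2^k\| \to 0$, $\|x_3^{k+1} - x_3^k\| \to 0$ and $\|\lambda^{k+1} - \lambda^k\| \to 0$, we have

\[
\lim_{q \to \infty} x_i^{k_q + 1} = \bar{x}_i, \ i = 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q + 1} = \bar{\lambda}.
\]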
q—> 00 g—>-oo 
By the same argument as in Theorem 3.1 we conclude that $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$ is an optimal solution of (7).

Finally, we prove that the whole sequence $(x_1^k, x_2^k, x_3^k, \lambda^k)$ converges to $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$, which gives
the conclusion of Theorem 3.2 and also implies (14). It suffices to prove that $(A_1 x_1^k, A_2 x_2^k, x_3^k, \lambda^k)$
converges to $(A_1 \bar{x}_1, A_2 \bar{x}_2, \bar{x}_3, \bar{\lambda})$ since $A_1$ and $A_2$ both have full column rank. Note that since
$(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$ is an optimal solution of (7), (41) holds with $(x_2^*, x_3^*, \lambda^*)$ replaced by $(\bar{x}_2, \bar{x}_3, \bar{\lambda})$.
Therefore, $\frac{1}{2\gamma}\|\lambda^k - \bar{\lambda}\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 \bar{x}_2\|^2 + \frac{\gamma}{2}\|x_3^k - \bar{x}_3\|^2$ is non-increasing. Moreover, we have
$\frac{1}{2\gamma}\|\lambda^{k_q} - \bar{\lambda}\|^2 + \frac{\gamma}{2}\|A_2 x_2^{k_q} - A_2 \bar{x}_2\|^2 + \frac{\gamma}{2}\|x_3^{k_q} - \bar{x}_3\|^2 \to 0$. Therefore, it follows that

\[
\frac{1}{2\gamma}\|\lambda^k - \bar{\lambda}\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 \bar{x}_2\|^2 + \frac{\gamma}{2}\|x_3^k - \bar{x}_3\|^2 \to 0,
\]

i.e., the whole sequence of $(A_2 x_2^k, x_3^k, \lambda^k)$ converges to $(A_2 \bar{x}_2, \bar{x}_3, \bar{\lambda})$. Furthermore, $\|A_1 x_1^k - A_1 \bar{x}_1\| \to 0$
by using (21). This completes the proof. □


3.3 Proof for $\gamma \in (0, \frac{1}{2}]$

In this subsection, we prove that the 3-block ADMM (18)-(21) is convergent for any $\gamma \in (0, \frac{1}{2}]$.

Theorem 3.3 Let $(x_1^k, x_2^k, x_3^k, \lambda^k)$ be generated by the 3-block ADMM (18)-(21), and $\gamma \in (0, \frac{1}{2}]$. Then
$(x_1^k, x_2^k, x_3^k, \lambda^k)$ is bounded, and it converges to an optimal solution of (7), which further implies
that (14) holds.


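Proof. Let $(x_1^*, x_2^*, x_3^*, \lambda^*) \in \Omega^*$. By setting $x_2 = x_2^k$ in (25), and $x_2 = x_2^{k+1}$ in (25) for the $k$-th
iteration, we can obtain

\[
(x_2^k - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} + \gamma A_2^\top \left( x_3^k - x_3^{k+1} \right) \right] \geq 0, \tag{42}
\]
\[
(x_2^{k+1} - x_2^k)^\top \left[ g_2(x_2^k) - A_2^\top \lambda^k + \gamma A_2^\top \left( x_3^{k-1} - x_3^k \right) \right] \geq 0. \tag{43}
\]

Summing (42) and (43) yields

\[
\begin{aligned}
& (A_2 x_2^{k+1} - A_2 x_2^k)^\top (\lambda^{k+1} - \lambda^k) \\
\geq{} & (x_2^{k+1} - x_2^k)^\top \left( g_2(x_2^{k+1}) - g_2(x_2^k) \right) + \gamma (A_2 x_2^{k+1} - A_2 x_2^k)^\top (x_3^k - x_3^{k-1}) + \gamma (A_2 x_2^{k+1} - A_2 x_2^k)^\top (x_3^k - x_3^{k+1}) \\
\geq{} & -\frac{\gamma}{3}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2 - \frac{3\gamma}{2}\|x_3^k - x_3^{k-1}\|^2 - \frac{3\gamma}{2}\|x_3^{k+1} - x_3^k\|^2,
\end{aligned}
\tag{44}
\]

where the second inequality follows from the monotonicity of $g_2$ and (16). Note that from (16) we
also have the following inequality:

\[
-\gamma (x_3^{k+1} - x_3^*)^\top (A_2 x_2^k - A_2 x_2^{k+1}) \geq -2\gamma\|x_3^{k+1} - x_3^*\|^2 - \frac{\gamma}{8}\|A_2 x_2^k - A_2 x_2^{k+1}\|^2. \tag{45}
\]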
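Note from the proof of Theorem 3.2 that (39) holds for any $\gamma > 0$. By combining (44), (45) and
(39), we have

\[
\begin{aligned}
& \left[ \frac{1}{2\gamma}\|\lambda^k - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^k - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^k - x_3^*\|^2 + \frac{3\gamma}{2}\|x_3^k - x_3^{k-1}\|^2 \right] \\
& - \left[ \frac{1}{2\gamma}\|\lambda^{k+1} - \lambda^*\|^2 + \frac{\gamma}{2}\|A_2 x_2^{k+1} - A_2 x_2^*\|^2 + \frac{\gamma}{2}\|x_3^{k+1} - x_3^*\|^2 + \frac{3\gamma}{2}\|x_3^{k+1} - x_3^k\|^2 \right] \\
\geq{} & (1 - 2\gamma)\|x_3^{k+1} - x_3^*\|^2 + \left( 1 + \frac{1}{2\gamma} + \frac{\gamma}{2} - 3\gamma \right)\|x_3^{k+1} - x_3^k\|^2 + \frac{\gamma}{24}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2 \\
\geq{} & \left( 1 + \frac{1}{2\gamma} - \frac{5\gamma}{2} \right)\|x_3^{k+1} - x_3^k\|^2 + \frac{\gamma}{24}\|A_2 x_2^{k+1} - A_2 x_2^k\|^2 \\
\geq{} & 0,
\end{aligned}
\tag{46}
\]

where we used the facts that $\lambda^k = x_3^k$ and $\gamma \in (0, \frac{1}{2}]$. Therefore, we have $\|x_3^{k+1} - x_3^k\| \to 0$,
$\|A_2 x_2^{k+1} - A_2 x_2^k\| \to 0$, and hence $\|\lambda^{k+1} - \lambda^k\| \to 0$. By the same arguments as in the proof of
Theorem 3.2 we conclude that $(x_1^k, x_2^k, x_3^k, \lambda^k)$ is bounded, and any of its cluster points $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$
is an optimal solution of (7). Also by the same arguments as in the proof of Theorem 3.2 we can
prove that the whole sequence $(x_1^k, x_2^k, x_3^k, \lambda^k)$ converges to $(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{\lambda})$, and this completes the
proof. □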


4 Extensions 

In this section, we give some extended results on the convergence of the 3-block ADMM (4) for solving (7) that do not restrict $f_3(x_3)$ to be $\frac{1}{2}\|x_3\|^2$. Instead, we make the following assumption on $f_3$ in this section.

Assumption 4.1 We assume that the function $f_3$ is lower bounded by $f_3^*$ and is strongly convex with parameter $\sigma > 0$, and that $\nabla f_3$ is Lipschitz continuous with Lipschitz constant $L > 0$; i.e., the following inequalities hold:

\[ \inf_{x_3 \in \mathbb{R}^p} f_3(x_3) \ge f_3^* > -\infty, \]

\[ f_3(y) \ge f_3(x) + (y - x)^\top \nabla f_3(x) + \frac{\sigma}{2} \|y - x\|^2, \quad \forall x, y \in \mathbb{R}^p, \tag{47} \]

or equivalently,

\[ (y - x)^\top \left( \nabla f_3(y) - \nabla f_3(x) \right) \ge \sigma \|y - x\|^2, \quad \forall x, y \in \mathbb{R}^p, \tag{48} \]

and

\[ \|\nabla f_3(y) - \nabla f_3(x)\| \le L \|y - x\|, \quad \forall x, y \in \mathbb{R}^p. \tag{49} \]
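As a concrete instance of Assumption 4.1, a quadratic $f_3(x) = \frac{1}{2} x^\top Q x$ with $Q$ symmetric positive definite is strongly convex with $\sigma = \lambda_{\min}(Q)$ and has a Lipschitz gradient with $L = \lambda_{\max}(Q)$. The sketch below checks (47)-(49) numerically on random points. The quadratic form, the function name `check_assumption_4_1` and the sampling scheme are illustrative assumptions, not part of the paper.

```python
import numpy as np

def check_assumption_4_1(Q, trials=200, seed=0, tol=1e-9):
    """Check (47)-(49) on random pairs for the hypothetical choice f3(x) = 0.5 * x^T Q x."""
    Q = np.asarray(Q, dtype=float)
    eigs = np.linalg.eigvalsh(Q)            # ascending eigenvalues of symmetric Q
    sigma, L = float(eigs[0]), float(eigs[-1])
    f3 = lambda x: 0.5 * x @ Q @ x
    grad = lambda x: Q @ x
    rng = np.random.default_rng(seed)
    ok = True
    for _ in range(trials):
        x = rng.standard_normal(Q.shape[0])
        y = rng.standard_normal(Q.shape[0])
        d = y - x
        # (47): strong convexity lower bound
        ok = ok and f3(y) >= f3(x) + d @ grad(x) + 0.5 * sigma * (d @ d) - tol
        # (48): strong monotonicity of the gradient
        ok = ok and d @ (grad(y) - grad(x)) >= sigma * (d @ d) - tol
        # (49): Lipschitz continuity of the gradient
        ok = ok and np.linalg.norm(grad(y) - grad(x)) <= L * np.linalg.norm(d) + tol
    return sigma, L, bool(ok)

sigma, L, ok = check_assumption_4_1(np.diag([2.0, 5.0]))
```

For the special case $Q = I$ this recovers $f_3(x_3) = \frac{1}{2}\|x_3\|^2$ with $\sigma = L = 1$.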
For ease of presentation, we restate problem (7) here (with $f_3(x_3)$ not restricted to be $\frac{1}{2}\|x_3\|^2$) as

\[ \begin{array}{ll} \min & f_1(x_1) + f_2(x_2) + f_3(x_3) \\ \text{s.t.} & A_1 x_1 + A_2 x_2 + x_3 = b, \quad x_i \in \mathcal{X}_i, \ i = 1, 2, \end{array} \tag{50} \]

where $f_3$ satisfies Assumption 4.1. In this section, we show that the 3-block ADMM converges when it is applied to solve (50), given that $\gamma$ is chosen to be any value in the following range:

\[ \gamma \in \left( 0, \ \min\left\{ \frac{4\sigma}{\eta_2}, \ \frac{\sigma(\eta_2 - 2)}{4\eta_2} + \sqrt{ \frac{\sigma^2 (\eta_2 - 2)^2}{16 \eta_2^2} + \frac{\sigma^2 (\eta_2 - 2)}{4 \eta_2} } \right\} \right) \cup \left( \sqrt{ \sigma^2 + \frac{2L^2}{\eta_1 - 2} } - \sigma, \ \frac{4\sigma}{\eta_1} \right) \cup \left( \frac{\sqrt{\sigma^2 + 8L^2} - \sigma}{2}, \ +\infty \right), \tag{51} \]

where $\eta_1$ and $\eta_2$ can be any value in $(2, +\infty)$. Note that if $\eta_1$ is chosen such that $\sqrt{\sigma^2 + \frac{2L^2}{\eta_1 - 2}} - \sigma \ge \frac{4\sigma}{\eta_1}$, then the second interval in (51) is empty.

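Note that the 3-block ADMM for solving (50) can be written as

\[ \begin{aligned}
x_1^{k+1} &:= \operatorname{argmin}_{x_1 \in \mathcal{X}_1} \ f_1(x_1) + \frac{\gamma}{2} \left\| A_1 x_1 + A_2 x_2^k + x_3^k - b - \lambda^k / \gamma \right\|^2, \\
x_2^{k+1} &:= \operatorname{argmin}_{x_2 \in \mathcal{X}_2} \ f_2(x_2) + \frac{\gamma}{2} \left\| A_1 x_1^{k+1} + A_2 x_2 + x_3^k - b - \lambda^k / \gamma \right\|^2, \\
x_3^{k+1} &:= \operatorname{argmin}_{x_3 \in \mathbb{R}^p} \ f_3(x_3) + \frac{\gamma}{2} \left\| A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3 - b - \lambda^k / \gamma \right\|^2, \\
\lambda^{k+1} &:= \lambda^k - \gamma \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b \right).
\end{aligned} \tag{52} \]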
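The iteration (52) can be sketched in a few lines when every subproblem has a closed form. The example below assumes, purely for illustration, $f_i(x_i) = \frac{\mu_i}{2}\|x_i\|^2$ for $i = 1, 2$, $f_3(x_3) = \frac{\sigma}{2}\|x_3\|^2$ and $\mathcal{X}_i = \mathbb{R}^{n_i}$, so that each update reduces to a linear solve. The function name `admm_3block_quadratic` and all parameter names are hypothetical.

```python
import numpy as np

def admm_3block_quadratic(A1, A2, b, mu1=1.0, mu2=1.0, sigma=1.0, gamma=1.0, iters=200):
    """Run iteration (52) for quadratic f_1, f_2, f_3 (an illustrative assumption)."""
    n1, n2, p = A1.shape[1], A2.shape[1], b.shape[0]
    x1, x2, x3, lam = np.zeros(n1), np.zeros(n2), np.zeros(p), np.zeros(p)
    # Hessians of the x1- and x2-subproblems; constant across iterations
    H1 = mu1 * np.eye(n1) + gamma * A1.T @ A1
    H2 = mu2 * np.eye(n2) + gamma * A2.T @ A2
    for _ in range(iters):
        # x1-update: mu1*x1 + gamma*A1^T(A1 x1 + A2 x2 + x3 - b) - A1^T lam = 0
        x1 = np.linalg.solve(H1, A1.T @ (lam - gamma * (A2 @ x2 + x3 - b)))
        # x2-update uses the fresh x1 and the old x3
        x2 = np.linalg.solve(H2, A2.T @ (lam - gamma * (A1 @ x1 + x3 - b)))
        # x3-update: sigma*x3 + gamma*(A1 x1 + A2 x2 + x3 - b) - lam = 0
        x3 = (lam - gamma * (A1 @ x1 + A2 @ x2 - b)) / (sigma + gamma)
        # multiplier update
        lam = lam - gamma * (A1 @ x1 + A2 @ x2 + x3 - b)
    return x1, x2, x3, lam

b = np.array([3.0, -6.0])
x1, x2, x3, lam = admm_3block_quadratic(np.eye(2), np.eye(2), b)
```

With $A_1 = A_2 = I$ and unit weights, the KKT conditions give $x_1 = x_2 = x_3 = \lambda = b/3$, which is the point the iterates approach in this toy setting. For nonsmooth $f_1$ or $f_2$, the linear solves would be replaced by the corresponding proximal-type subproblem solvers.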


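The first-order optimality conditions for the three subproblems in (52) are given by $x_i^{k+1} \in \mathcal{X}_i$ and, for any $x_i \in \mathcal{X}_i$, $i = 1, 2$,

\[ (x_1 - x_1^{k+1})^\top \left[ g_1(x_1^{k+1}) - A_1^\top \lambda^k + \gamma A_1^\top \left( A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b \right) \right] \ge 0, \tag{53} \]

\[ (x_2 - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^k + \gamma A_2^\top \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^k - b \right) \right] \ge 0, \tag{54} \]

\[ \nabla f_3(x_3^{k+1}) - \lambda^k + \gamma \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b \right) = 0, \tag{55} \]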

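where $g_i \in \partial f_i$ is a subgradient of $f_i$ for $i = 1, 2$. Moreover, by combining with the updating formula for $\lambda^{k+1}$, (53)-(55) can be rewritten as

\[ (x_1 - x_1^{k+1})^\top \left[ g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} + \gamma A_1^\top \left( A_2 (x_2^k - x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right) \right] \ge 0, \tag{56} \]

\[ (x_2 - x_2^{k+1})^\top \left[ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} + \gamma A_2^\top (x_3^k - x_3^{k+1}) \right] \ge 0, \tag{57} \]

\[ \nabla f_3(x_3^{k+1}) - \lambda^{k+1} = 0. \tag{58} \]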
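The substitution behind (56) can be spelled out as a short worked step. The sketch below uses only the multiplier update in (52), written as $\lambda^k = \lambda^{k+1} + \gamma (A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b)$, applied to the bracketed term of (53); the same substitution in (54) and (55) gives (57) and (58).

```latex
\begin{aligned}
-A_1^\top \lambda^k + \gamma A_1^\top \left( A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b \right)
&= -A_1^\top \lambda^{k+1} - \gamma A_1^\top \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b \right) \\
&\quad + \gamma A_1^\top \left( A_1 x_1^{k+1} + A_2 x_2^k + x_3^k - b \right) \\
&= -A_1^\top \lambda^{k+1} + \gamma A_1^\top \left( A_2 (x_2^k - x_2^{k+1}) + (x_3^k - x_3^{k+1}) \right).
\end{aligned}
```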


Before presenting our main result in this section, we give a technical lemma which will be used in 
our subsequent analysis; the proof of the lemma can be found in the appendix. 
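Lemma 4.2 Assume Assumptions 2.4 and 2.5 hold. The following results hold for the 3-block ADMM (52) applied to (50) with $f_3$ satisfying Assumption 4.1.

1. If $\gamma \in \left( \frac{\sqrt{\sigma^2 + 8L^2} - \sigma}{2}, +\infty \right)$, then

\[ \lim_{k \to \infty} \left\| A_2 x_2^{k+1} - A_2 x_2^k \right\| = 0, \quad \lim_{k \to \infty} \left\| x_3^{k+1} - x_3^k \right\| = 0, \quad \lim_{k \to \infty} \left\| \lambda^{k+1} - \lambda^k \right\| = 0, \tag{59} \]

$\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence and $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ converges to $\mathcal{L}_\gamma(x_1^*, x_2^*, x_3^*; \lambda^*)$.

2. If $\gamma \in \left( \sqrt{\sigma^2 + \frac{2L^2}{\eta_1 - 2}} - \sigma, \ \frac{4\sigma}{\eta_1} \right) \cup \left( 0, \ \min\left\{ \frac{4\sigma}{\eta_2}, \ \frac{\sigma(\eta_2 - 2)}{4\eta_2} + \sqrt{ \frac{\sigma^2 (\eta_2 - 2)^2}{16 \eta_2^2} + \frac{\sigma^2 (\eta_2 - 2)}{4 \eta_2} } \right\} \right)$ with $\eta_1$ and $\eta_2$ arbitrarily chosen in $(2, +\infty)$, then (59) holds, $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence and the whole sequence $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ converges to $(x_1^*, x_2^*, x_3^*, \lambda^*)$.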


Theorem 4.3 Assume Assumptions 2.4 and 2.5 hold. Let $(x_1^k, x_2^k, x_3^k, \lambda^k)$ be generated by the 3-block ADMM (52) with $\gamma$ chosen as in (51). Then $(x_1^k, x_2^k, x_3^k, \lambda^k)$ is bounded, and any of its cluster points $(x_1^*, x_2^*, x_3^*, \lambda^*)$ is an optimal solution of (50). Moreover, we have

\[ \lim_{k \to \infty} \left| f_1(x_1^k) + f_2(x_2^k) + f_3(x_3^k) - f^* \right| = 0, \quad \lim_{k \to \infty} \left\| A_1 x_1^k + A_2 x_2^k + x_3^k - b \right\| = 0, \tag{60} \]

where $f^*$ denotes the optimal objective value of problem (50).


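Proof. Since $\gamma$ is chosen as in (51), it follows from Lemma 4.2 that $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence. Hence, there exist a cluster point $(x_1^*, x_2^*, x_3^*, \lambda^*)$ and a subsequence $\{k_q\}$ such that

\[ \lim_{q \to \infty} x_i^{k_q} = x_i^*, \ i = 1, 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q} = \lambda^*. \]

By using (59), we have

\[ \lim_{q \to \infty} x_i^{k_q + 1} = x_i^*, \ i = 2, 3, \qquad \lim_{q \to \infty} \lambda^{k_q + 1} = \lambda^*. \]

For the $(k_q + 1)$-th iteration, using the convexity of $f_1$ and $f_2$, the optimality conditions (56)-(58) and the updating formula for $\lambda^{k+1}$ can be written as

\[ f_1(x_1) - f_1(x_1^{k_q+1}) + (x_1 - x_1^{k_q+1})^\top \left[ -A_1^\top \lambda^{k_q+1} + \gamma A_1^\top \left( A_2 (x_2^{k_q} - x_2^{k_q+1}) + (x_3^{k_q} - x_3^{k_q+1}) \right) \right] \ge 0, \]

\[ f_2(x_2) - f_2(x_2^{k_q+1}) + (x_2 - x_2^{k_q+1})^\top \left[ -A_2^\top \lambda^{k_q+1} + \gamma A_2^\top (x_3^{k_q} - x_3^{k_q+1}) \right] \ge 0, \]

\[ \nabla f_3(x_3^{k_q+1}) - \lambda^{k_q+1} = 0, \]

\[ A_1 x_1^{k_q+1} + A_2 x_2^{k_q+1} + x_3^{k_q+1} - b - \frac{1}{\gamma} \left( \lambda^{k_q} - \lambda^{k_q+1} \right) = 0, \]

where $x_1 \in \mathcal{X}_1$ and $x_2 \in \mathcal{X}_2$. By letting $q \to +\infty$, and using (59), the lower semi-continuity of $f_1$ and $f_2$, and the continuity of $\nabla f_3$, we have

\[ f_1(x_1) - f_1(x_1^*) - (x_1 - x_1^*)^\top (A_1^\top \lambda^*) \ge 0, \]

\[ f_2(x_2) - f_2(x_2^*) - (x_2 - x_2^*)^\top (A_2^\top \lambda^*) \ge 0, \]

\[ \nabla f_3(x_3^*) - \lambda^* = 0, \]

\[ A_1 x_1^* + A_2 x_2^* + x_3^* - b = 0. \]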


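This implies that $(x_1^*, x_2^*, x_3^*, \lambda^*)$ is an optimal solution of problem (50). It also follows from Lemma 4.2 that

\[ \lim_{k \to \infty} \left\| A_1 x_1^{k+1} + A_2 x_2^{k+1} + x_3^{k+1} - b \right\| = \lim_{k \to \infty} \frac{1}{\gamma} \left\| \lambda^k - \lambda^{k+1} \right\| = 0. \tag{61} \]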


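Moreover, if $\gamma \in \left( \frac{\sqrt{\sigma^2 + 8L^2} - \sigma}{2}, +\infty \right)$, from part 1 of Lemma 4.2 we have

\[ \begin{aligned}
\left| f_1(x_1^k) + f_2(x_2^k) + f_3(x_3^k) - f^* \right|
&\le \left| \mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k) - \mathcal{L}_\gamma(x_1^*, x_2^*, x_3^*; \lambda^*) \right| \\
&\quad + \left| (\lambda^k)^\top \left( A_1 x_1^k + A_2 x_2^k + x_3^k - b \right) \right| + \frac{\gamma}{2} \left\| A_1 x_1^k + A_2 x_2^k + x_3^k - b \right\|^2,
\end{aligned} \]

which implies that

\[ \lim_{k \to \infty} \left| f_1(x_1^k) + f_2(x_2^k) + f_3(x_3^k) - f^* \right| = 0, \tag{62} \]

by using part 1 of Lemma 4.2 and (61).

If $\gamma$ satisfies

\[ \gamma \in \left( 0, \ \min\left\{ \frac{4\sigma}{\eta_2}, \ \frac{\sigma(\eta_2 - 2)}{4\eta_2} + \sqrt{ \frac{\sigma^2 (\eta_2 - 2)^2}{16 \eta_2^2} + \frac{\sigma^2 (\eta_2 - 2)}{4 \eta_2} } \right\} \right) \cup \left( \sqrt{\sigma^2 + \frac{2L^2}{\eta_1 - 2}} - \sigma, \ \frac{4\sigma}{\eta_1} \right) \]

for arbitrarily chosen $\eta_1 > 2$ and $\eta_2 > 2$, then by using part 2 of Lemma 4.2, (62) follows immediately because the whole sequence $(x_1^k, x_2^k, x_3^k; \lambda^k)$ converges to $(x_1^*, x_2^*, x_3^*; \lambda^*)$. □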


Remark 4.4 We remark here that although the range defined in (51) does not cover all values in $(0, +\infty)$, it shows that the 3-block ADMM applied to solve (50) globally converges for most values of $\gamma$. In Table 1 we list several cases for different values of $(\sigma, L, \eta_1, \eta_2)$. From Table 1 we can see that in many cases the range defined in (51) is equal to $(0, +\infty)$, and in some cases, although the range is not equal to $(0, +\infty)$, it covers most of $(0, +\infty)$. In this sense, we can conclude that the choice of the parameter $\gamma$ is "relatively free" for solving (50) with $f_3$ satisfying Assumption 4.1.
$(\sigma, L, \eta_1, \eta_2)$      Range in (51)
(2, 2.1, 3.5, 6.5)                 (0, +∞)
(14, 15, 3.5, 6.5)                 (0, +∞)
(50, 52, 3.5, 5)                   (0, +∞)
(70, 74, 3.7, 7)                   (0, +∞)
(200, 205, 3.8, 8)                 (0, +∞)
(500, 505, 3.9, 6)                 (0, +∞)
(1, 1, 3, 4)                       (0, 0.5) ∪ (0.7321, +∞)

Table 1: Range of $\gamma$ defined in (51)
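The entries of Table 1 can be recomputed from the three intervals in (51). The sketch below evaluates the endpoints for given $(\sigma, L, \eta_1, \eta_2)$, drops empty intervals and merges overlapping ones. The function name `gamma_range` and the interval-merging helper logic are illustrative, not taken from the paper.

```python
import math

def gamma_range(sigma, L, eta1, eta2):
    """Return the union in (51) as a sorted list of disjoint open intervals (lo, hi)."""
    t = sigma * (eta2 - 2) / (4 * eta2)
    first_hi = min(4 * sigma / eta2,
                   t + math.sqrt(t * t + sigma**2 * (eta2 - 2) / (4 * eta2)))
    pieces = [
        (0.0, first_hi),                                              # first interval
        (math.sqrt(sigma**2 + 2 * L**2 / (eta1 - 2)) - sigma,
         4 * sigma / eta1),                                           # second interval
        ((math.sqrt(sigma**2 + 8 * L**2) - sigma) / 2, math.inf),     # third interval
    ]
    pieces = sorted(p for p in pieces if p[0] < p[1])   # drop empty intervals
    merged = [pieces[0]]
    for lo, hi in pieces[1:]:
        last_lo, last_hi = merged[-1]
        if lo < last_hi:            # open intervals overlap, so their union is one interval
            merged[-1] = (last_lo, max(last_hi, hi))
        else:
            merged.append((lo, hi))
    return merged

print(gamma_range(1, 1, 3, 4))
print(gamma_range(2, 2.1, 3.5, 6.5))
```

For $(1, 1, 3, 4)$ the second and third intervals overlap and merge into $(\sqrt{3} - 1, +\infty)$, while the first interval ends at $0.5$, which leaves the gap reported in the last row of Table 1.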


5 Conclusions 

Motivated by the fact that the 2-block ADMM globally converges for any penalty parameter $\gamma > 0$, we studied in this paper the global convergence of the 3-block ADMM. As there exists a counter-example showing that the 3-block ADMM can diverge if no further condition is imposed, it is natural to look for sufficient conditions which can guarantee the convergence of the 3-block ADMM. However, the existing results on sufficient conditions usually require $\gamma$ to be smaller than a certain bound, which is usually very small and therefore not practical. In this paper, we showed that the 3-block ADMM globally converges for any $\gamma > 0$ when it is applied to solve a class of regularized least squares problems; that is, the 3-block ADMM is parameter-unrestricted for this class of problems. We also extended this result to a more general problem, and showed that the 3-block ADMM globally converges for most values of $\gamma$ in $(0, +\infty)$.
A Proof of Lemma 4.2

Proof. By (58) and the Lipschitz continuity of $\nabla f_3$, we have

\[ \left\| \lambda^{k+1} - \lambda^k \right\| \le L \left\| x_3^{k+1} - x_3^k \right\|. \tag{63} \]

Letting $x_2 = x_2^k$ in the $(k+1)$-th iteration and $x_2 = x_2^{k+1}$ in the $k$-th iteration of (57) yields

Adding these two inequalities, using the monotonicity of $\partial f_2$ and applying (16), we obtain that the following inequality holds for any $\epsilon > 0$:

Now we prove part 1 of Lemma 4.2. Firstly, we prove that $\mathcal{L}_\gamma(w^k)$ is a non-increasing sequence. By similar arguments as in (26), (27) and (28), we have the following inequalities:

By using (63), we have

Combining (66) and (67) yields

where $M := \min\left\{ \frac{\gamma}{2}, \ \frac{\sigma + \gamma}{2} - \frac{L^2}{\gamma} \right\}$. Since $\gamma \in \left( \frac{\sqrt{\sigma^2 + 8L^2} - \sigma}{2}, +\infty \right)$, we have $M > 0$.
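The link between the lower threshold on $\gamma$ and the positivity of $M$ can be checked in one line. The sketch below assumes that the entry of $M$ multiplying $\|x_3^{k+1} - x_3^k\|^2$ has the form $\frac{\sigma + \gamma}{2} - \frac{L^2}{\gamma}$; this form is a reading of the definition of $M$ and should be treated as an assumption.

```latex
\frac{\sigma + \gamma}{2} - \frac{L^2}{\gamma} > 0
\iff \gamma^2 + \sigma \gamma - 2L^2 > 0
\iff \gamma > \frac{\sqrt{\sigma^2 + 8L^2} - \sigma}{2}
\qquad (\gamma > 0).
```

The last equivalence holds because the quadratic $\gamma^2 + \sigma\gamma - 2L^2$ has exactly one positive root, namely $\frac{-\sigma + \sqrt{\sigma^2 + 8L^2}}{2}$.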

Then we prove that $\mathcal{L}_\gamma(w^k)$ is uniformly lower bounded. Since $f_1$, $f_2$ and $f_3$ are all lower bounded, we have

where the first inequality holds from the convexity of $f_3$ and the Lipschitz continuity of $\nabla f_3$. By combining (68) and (69), for any integer $K > 0$ we have

which, combined with (63), yields (59).

Note that (68) shows that $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ is monotonically non-increasing. This together with (69) shows that $\mathcal{L}_\gamma(x_1^k, x_2^k, x_3^k; \lambda^k)$ converges to $\mathcal{L}_\gamma(x_1^*, x_2^*, x_3^*; \lambda^*)$. Finally, we prove that $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) \}$ is a bounded sequence. Note that (69) and the coerciveness of $f_1 + \mathbf{1}_{\mathcal{X}_1}$ and $f_2 + \mathbf{1}_{\mathcal{X}_2}$ imply that $\{ (x_1^k, x_2^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence. This together with the updating formula of $\lambda^{k+1}$ and (59) yields the boundedness of $x_3^k$. Moreover, this combined with (58) gives the boundedness of $\lambda^k$. Hence, $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence.

Now we prove part 2 of Lemma 4.2. We first assume that $\gamma \in$

Now by applying (15) to the three terms on the left hand side of (70) we get

By combining (63), (72), (73) and (71), we get

Therefore, we conclude from (75)-(77) and (71) that


> 

> 

> 


' 1 

X k - X* 

2 

+ - 

[27 


2 


' 1 

\k+l _ A * 

2 

+ — 

[27 


2 


a 


_ 727 \ 
4 ) 


k -\-1 * 

x 3 — x 3 


A 2 X 2 - A 2 X* 2 


A 2 x2 +1 - A 2 x * 2 

+ ( a+ k~' ,c 


2 7 

k * 

2 7e 

k, /c— 1 

2" 

+ 2 

x 3 x 3 

+ T 

<-y» /v»' 

*3 .63 



+ 


7 


k +1 


+ 


76 


™.k+l rjM 

x 3 x 3 


k +1 k 
x 3 — x 3 


+ 


a 


° + o- 7 ^ 

2 7 


T k +1 _ k 
x 3 x 3 


, I 7 _ J_ _ 1 
2 r ] 2 e 


A 2 x2 +1 - A 2 x 2 


7_7. 

2 72 

* 2 




/c+1 


71 - 2*2 


2 


where the second and third inequalities hold because 7 € ^0, min j + ‘\J a ' 2 ^ r p‘ — + 

for any 72 > 2 implies 


n ^ 4cr 7 7 7 . n 

0 < 7 < — --> 0 , 

72 2 72 6 


er 


cr + --76 > 0. 

27 


This implies $\|x_3^{k+1} - x_3^k\| \to 0$, $\|A_2 x_2^{k+1} - A_2 x_2^k\| \to 0$ and hence $\|\lambda^{k+1} - \lambda^k\| \to 0$. This also implies that the sequence

\[ \frac{1}{2\gamma} \left\| \lambda^k - \lambda^* \right\|^2 + \frac{\gamma}{2} \left\| A_2 x_2^k - A_2 x_2^* \right\|^2 + \frac{\gamma \epsilon}{2} \left\| x_3^k - x_3^{k-1} \right\|^2 \]

is non-increasing, which further implies that $\{ (A_2 x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is bounded. Since $A_1$ and $A_2$ both have full column rank, we conclude that $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ is a bounded sequence.

Finally, using similar arguments as in Theorem 3.2, it is easy to prove that the whole sequence $\{ (x_1^k, x_2^k, x_3^k, \lambda^k) : k = 0, 1, 2, \ldots \}$ converges to $(x_1^*, x_2^*, x_3^*, \lambda^*)$. We omit the details here for succinctness. □